Optimizing Similarity Search in the M-Tree

Authors

  • Steffen Guhlemann
  • Uwe Petersohn
  • Klaus Meyer-Wegener

Abstract

The similarity of data entries is a topic of growing interest in a wide range of domains. Data sets of genome sequences, text corpora, complex production information, and multimedia content are typically large and unstructured, and computing similarities in them is expensive. The only common denominator that a data structure for efficient similarity search can rely on is the set of metric axioms. One such data structure for efficient similarity search in metric spaces is the M-Tree, along with a number of compatible extensions (e.g. Slim-Tree, Bulk Loaded M-Tree, multiway insertion M-Tree, M2-Tree, etc.). The M-Tree family uses common algorithms for k-nearest-neighbor and range search. In this paper we present new algorithms for these tasks that considerably improve the retrieval performance of all M-Tree-compatible data structures.
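
The abstract does not describe the new algorithms, so the following is only a minimal sketch of the conventional idea behind range search in an M-Tree-like structure, not the authors' method. It assumes each node stores a routing object and a covering radius, and it uses the triangle inequality (one of the metric axioms) to skip subtrees that cannot contain a result. The names `Node` and `range_search` are hypothetical, and a real M-Tree also stores parent distances and handles node splits, which are omitted here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class Node:
    """Hypothetical simplified node: a ball around a routing object."""
    center: Any                      # routing object
    radius: float                    # covering radius of the whole subtree
    children: List["Node"] = field(default_factory=list)
    objects: List[Any] = field(default_factory=list)  # payload (leaves only)


def range_search(node: Node, query: Any, r: float,
                 dist: Callable[[Any, Any], float],
                 out: Optional[List[Any]] = None) -> List[Any]:
    """Collect all stored objects o with dist(query, o) <= r."""
    if out is None:
        out = []
    d = dist(query, node.center)
    # Triangle inequality: every object o in this subtree satisfies
    # dist(query, o) >= d - node.radius, so the subtree can be skipped
    # when that lower bound already exceeds the search radius.
    if d - node.radius > r:
        return out
    for o in node.objects:
        if dist(query, o) <= r:
            out.append(o)
    for child in node.children:
        range_search(child, query, r, dist, out)
    return out


# Usage on one-dimensional points with the absolute difference as metric.
leaf_a = Node(center=1.0, radius=1.0, objects=[0.0, 1.0, 2.0])
leaf_b = Node(center=10.0, radius=1.0, objects=[9.0, 10.0, 11.0])
root = Node(center=5.0, radius=6.0, children=[leaf_a, leaf_b])
hits = range_search(root, 1.5, 1.0, lambda a, b: abs(a - b))
```

In this usage example the subtree `leaf_b` is rejected after a single distance computation to its routing object, which is the kind of saving that matters when each distance evaluation is expensive.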





Publication date: 2017